fix(server): auto-recover from corrupted SQLite database on startup #1231
eggfriedrice24 wants to merge 2 commits into pingdotgg:main
This PR is useful for genuinely corrupted databases. As a side note: a corrupted or invalid row in orchestration_events can lead to a similar startup failure during replay. The same may apply to undefined providers. While ProviderService appears to handle those more explicitly, ProjectionPipeline.bootstrap and OrchestrationEngine do not. This might become relevant as new providers get added, e.g. #179.
Refs #961
Problem
When state.sqlite becomes corrupted (truncated, overwritten, or contains non-SQLite data), the backend server crashes immediately on startup. The desktop app then restarts the backend in a loop, producing endless backend exited unexpectedly (code=1) messages with no recovery path. Users are stuck unless they manually find and delete the database file.
The root cause is two-fold:
- makeSqlitePersistenceLive in Sqlite.ts blindly passes the database path to the SQLite client with no pre-flight validation
- NodeSqliteClient.ts calls openDatabase() as a bare synchronous call inside Effect.gen - when it throws, the error becomes an unhandled defect that crashes the process
Fix
Corruption detection and auto-recovery (Sqlite.ts)
Before opening the database, makeSqlitePersistenceLive now validates the file header. Every valid SQLite file starts with the 16-byte magic string SQLite format 3\0. If the file exists but has an invalid header:
- The corrupted file is renamed to state.sqlite.corrupted.<timestamp> (preserving it for debugging)
Safety net for database open errors (NodeSqliteClient.ts)
The bare openDatabase() call is now wrapped in Effect.try with Effect.orDie, so any throw during database construction produces a properly reported defect with a clear "Failed to open database" message instead of an opaque crash.
Testing
Manually verified with a corrupted database file:
Before: crash loop, file is not a database repeated every ~500ms, process killed
After:
Header validation was also tested against:
Scope
This is complementary to #964 (draft), which handles the web/UI recovery view for bootstrap snapshot failures. This PR fixes the server side so the backend never gets stuck in a crash loop.
Test plan
- Corrupt state.sqlite with echo "not a db" > ~/.t3/userdata/state.sqlite, start app, verify recovery
- Delete state.sqlite entirely, start app, verify fresh database is created
- Verify .corrupted.<timestamp> file is created alongside the new database
Note
Auto-recover from corrupted SQLite database on server startup
- makeSqlitePersistenceLive in Sqlite.ts validates the existing database file by checking the 16-byte SQLite magic header before initializing.
- If the header is invalid, the file is renamed to a .corrupted backup and any -wal/-shm sidecar files are removed (best-effort), then a fresh database is created and a warning is logged.
- openDatabase() is now wrapped in Effect.try so failures produce a SqlError with the message 'Failed to open database' before terminating the fiber.
Macroscope summarized 33283db.
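For readers who want to see the header check and rename step in isolation, here is a minimal sketch using only Node's fs module. It is not the code from this PR: the helper names hasValidSqliteHeader and recoverIfCorrupted are hypothetical, the millisecond timestamp suffix is an assumption, and accepting a zero-byte file (which SQLite treats as a new, empty database) is a design choice of this sketch, not something the description confirms.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Every SQLite 3 database file begins with these 16 bytes.
const SQLITE_MAGIC = Buffer.from("SQLite format 3\0", "latin1");

/** Hypothetical helper: true when the first 16 bytes match the SQLite magic string. */
function hasValidSqliteHeader(dbPath: string): boolean {
  const fd = fs.openSync(dbPath, "r");
  try {
    const header = Buffer.alloc(SQLITE_MAGIC.length);
    const bytesRead = fs.readSync(fd, header, 0, header.length, 0);
    // Assumption: an empty file is accepted, since SQLite initializes it as a new database.
    if (bytesRead === 0) return true;
    return bytesRead === header.length && header.equals(SQLITE_MAGIC);
  } finally {
    fs.closeSync(fd);
  }
}

/**
 * Hypothetical helper: moves a corrupted database aside and removes stale sidecar files.
 * Returns the backup path, or null when no recovery was needed.
 */
function recoverIfCorrupted(dbPath: string, now: number = Date.now()): string | null {
  if (!fs.existsSync(dbPath) || hasValidSqliteHeader(dbPath)) return null;
  // Assumption: the timestamp suffix is epoch milliseconds.
  const backupPath = `${dbPath}.corrupted.${now}`;
  fs.renameSync(dbPath, backupPath);
  for (const suffix of ["-wal", "-shm"]) {
    // Best-effort cleanup: `force: true` ignores a sidecar that does not exist.
    fs.rmSync(dbPath + suffix, { force: true });
  }
  console.warn(`Corrupted database moved to ${backupPath}; a fresh one will be created.`);
  return backupPath;
}

// Usage: simulate `echo "not a db" > state.sqlite` inside a temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "sqlite-recovery-"));
const demoDb = path.join(demoDir, "state.sqlite");
fs.writeFileSync(demoDb, "not a db\n");
const demoBackup = recoverIfCorrupted(demoDb);
```

After the call, the original path no longer exists, so the SQLite client would create a fresh database there on open, while the renamed file stays next to it for debugging.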